Abstract The importance of algorithm portfolio techniques for SAT has long been
noted, and a number of very successful systems have been devised, including the
most successful one, SATzilla. However, all these systems are quite complex
(to understand, reimplement, or modify). In this paper we present an algorithm
portfolio for SAT that is extremely simple, yet at the same time so efficient that
it outperforms SATzilla. For a new SAT instance to be solved, our portfolio finds
its k-nearest neighbors from the training set and invokes a solver that performs
the best on those instances. The main distinguishing feature of our algorithm
portfolio is the locality of the selection procedure: the selection of a SAT solver
is based only on a few instances similar to the input one. An open source tool that
implements our approach is publicly available.

Keywords algorithm portfolios, SAT solving
1 Introduction

Solving time for a SAT instance can vary significantly between solvers. Therefore,
for many SAT instances, the availability of different solvers may be beneficial.
This observation leads to algorithm portfolios which, among several available
solvers, select one that is expected to perform best on a given instance. This
selection is based on data about the performance of the available solvers on a large
training set of instances. The problem of algorithm portfolio is not limited
to the SAT problem, but can be considered in general (Huberman, Lukose, &
Hogg, 1997; Gomes & Selman, 2001; Horvitz, Ruan, Gomes, Kautz, Selman, &
Chickering, 2001).

There are a number of approaches to algorithm portfolios for SAT, the most
successful one being SATzilla (Xu, Hutter, Hoos, & Leyton-Brown, 2008), which
regularly wins in various categories of SAT Competitions. SATzilla is very
successful, but it is rather complex machinery that is not easy to understand,
reimplement, or modify. In this paper we present an algorithm portfolio system,
based on the k-nearest neighbors method, that is conceptually significantly simpler
and more efficient than SATzilla. It derives from our earlier research on solver
policy selection (Nikolic, Maric, & Janicic, 2009).

The rest of the paper is organized as follows. In Section 2, some of the ex- 
isting algorithm portfolios are described. In Section 3, the proposed technique is 
described and in Section 4 the experimental results are presented. The conclusions 
are drawn in Section 5. 



2 Algorithm Portfolios for SAT 

Various approaches to algorithm portfolio for SAT and related problems have been
devised (Gomes & Selman, 2001; Horvitz et al., 2001; Lagoudakis & Littman,
2001; Samulowitz & Memisevic, 2007), but the turning point in the field has been
marked by the appearance of SATzilla portfolio (Nudelman, Brown, Hoos, De- 
vkar, & Shoham, 2004; Xu et al., 2008). Here we describe several recent rele- 
vant approaches for algorithm selection for SAT, most of them using fragments of 
SATzilla methodology. 

SATzilla. SATzilla, the algorithm portfolio that has been dominating recent SAT 
Competitions, is the most important and the most successful algorithm portfolio 
for SAT, with admirable performance (Xu et al., 2008; Xu, Hutter, Hoos, & Leyton- 
Brown, 2009). SATzilla represents instances by using different features and then 
predicts runtime of its constituent solvers based on these features and relying on 
empirical hardness models obtained during the training phase. 

SATzilla is a very complex system. On a given input instance, SATzilla first
runs two presolvers for a short amount of time, in the hope that easy instances
will be quickly dispatched. If an instance is not solved by the presolvers, its
features are computed. Since the feature computation can take too long, the
feature computation time is first predicted using empirical hardness models. If
the estimate is more than 2 minutes, a backup solver is run. Otherwise, using the
computed features, a runtime for each component solver is predicted. The
solver predicted to be the best is invoked. If this solver fails (e.g., if it
crashes or runs out of memory), the next best solver is invoked.
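The control flow described above can be sketched in Python as follows. This is only an illustration of the order of decisions, not SATzilla's code: every name (`run_portfolio`, `predict_feature_time`, `predict_runtime`, and so on) is hypothetical, solvers are modeled as plain callables that return `None` on failure, and time limiting of presolvers is assumed to happen inside those callables.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical solver interface: returns True/False for SAT/UNSAT,
# or None if the solver fails (crash, timeout, out of memory).
Solver = Callable[[str], Optional[bool]]


def run_portfolio(
    instance: str,
    presolvers: List[Solver],
    backup: Solver,
    components: Dict[str, Solver],
    predict_feature_time: Callable[[str], float],
    compute_features: Callable[[str], List[float]],
    predict_runtime: Callable[[str, List[float]], float],
    feature_time_limit: float = 120.0,  # the 2-minute threshold from the text
) -> Optional[bool]:
    # 1. Presolvers: easy instances are dispatched here.
    for presolver in presolvers:
        result = presolver(instance)
        if result is not None:
            return result
    # 2. If feature computation is predicted to be too slow, use the backup solver.
    if predict_feature_time(instance) > feature_time_limit:
        return backup(instance)
    # 3. Rank component solvers by predicted runtime, best first.
    features = compute_features(instance)
    ranked = sorted(components, key=lambda name: predict_runtime(name, features))
    # 4. Invoke solvers in that order, moving on whenever one fails.
    for name in ranked:
        result = components[name](instance)
        if result is not None:
            return result
    return None
```

Even in this reduced form, the procedure needs three separate predictive or preselected ingredients (presolvers, a feature-time model, and per-solver runtime models) before any component solver is run.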

The training data are obtained by measuring the solving time for all instances
from some chosen training set by all solvers from some chosen set of solvers
(using some predetermined cutoff time). For each category of instances used in SAT
Competitions (random, crafted, and industrial), a separate SATzilla system is
built. For each system, a hierarchical empirical hardness model for each solver
is trained to predict its runtime. This prediction is obtained by combining
runtime predictions of separate conditional models for satisfiable and for
unsatisfiable instances. To enable this, SATzilla estimates the probability that
the input instance is satisfiable, using an estimator trained by sparse multinomial
logistic regression. Each conditional model is obtained in the following manner.
First, starting from a set of base features, a feature selection step is performed
in order to find features that maximally reduce the model training error. Then, the
pairwise products of the remaining features are added as new features, and a second
round of feature selection is performed. Finally, the ridge regression model for
runtime prediction is trained using the selected features. From the set of solvers
that have been evaluated on the training data, the best solvers are chosen as the
component solvers automatically, using a randomized iterative procedure. The
presolvers and the backup solver are also selected automatically.



ArgoSmArT. ArgoSmArT is a system developed for instance-based selection of
policies for a single SAT solver (Nikolic et al., 2009). As a suitable underlying SAT
solver it uses the modular solver ArgoSAT (Maric, 2009). ArgoSmArT uses a training
set of SAT instances divided manually into classes of instances of similar origin
(e.g., as in the SAT Competition corpus). Each instance is represented using (a
subset of) the SATzilla features. For the input instance to be solved, the feature
values are computed and the nearest neighbor instance from the training set (with
respect to some distance measure), belonging to some class c, is found. Then, the
input instance is solved using the solver configuration that is known to perform
best on the class c.

ArgoSmArT does not deal with solver tuning and assumes that good con- 
figurations for classes are provided in advance. This approach could be used for 
selection of policies for other solvers, too. Moreover, it can also be used as an
algorithm portfolio.

ISAC. ISAC is a solver configurator that also has the potential to be applied to the
general problem of algorithm portfolio (Kadioglu, Malitsky, Sellmann, & Tierney, 
2010). It divides a training set in families automatically using a clustering tech- 
nique. It is integrated with GGA (Ansotegui, Sellmann, & Tierney, 2009), a system 
capable of finding a good solver configuration for each family. The instances are 
represented using SATzilla features, but scaled in the interval [-1,1]. For an input 
instance, the features are computed and the nearest center of available clusters 
is found. If the distance from the center of the cluster is less than some thresh- 
old value, the best configuration for that cluster is used for solving. Otherwise, a 
configuration that performs the best on the whole training set is used. 
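The selection rule described for ISAC (nearest cluster center, with a fallback when the instance is too far from every cluster) can be illustrated with a short Python sketch. The function and parameter names are hypothetical, and the use of Euclidean distance to the centers is an assumption made for the example; the text above does not state which distance ISAC uses.

```python
import math
from typing import Dict, Sequence


def scale_to_unit_interval(value: float, lo: float, hi: float) -> float:
    """Linearly map a feature value from [lo, hi] to [-1, 1]."""
    if hi == lo:
        return 0.0  # assumption: a constant feature carries no information
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def choose_configuration(
    features: Sequence[float],
    centers: Dict[str, Sequence[float]],
    cluster_configs: Dict[str, str],
    default_config: str,
    threshold: float,
) -> str:
    # Find the cluster whose center is nearest to the (scaled) feature vector.
    nearest = min(centers, key=lambda c: math.dist(features, centers[c]))
    # Use that cluster's configuration only if the instance is close enough.
    if math.dist(features, centers[nearest]) < threshold:
        return cluster_configs[nearest]
    # Otherwise fall back to the configuration that is best on the whole training set.
    return default_config
```

For example, with centers at (0, 0) and (1, 1) and a threshold of 0.5, the vector (0.1, 0) is assigned the first cluster's configuration, while (0.5, -0.9) is farther than 0.5 from both centers and receives the default configuration.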

Latent class models. Another recent approach promotes the use of statistical models
of solver behavior (latent class models) for algorithm selection (Silverthorn &
Miikkulainen, 2010). The proposed models try to capture the dependencies between
solvers, problems, and run durations. Each instance from the training set is solved
many times by each solver and a model is fit to the outcomes observed in the
training phase using the iterative expectation-maximization algorithm. During the
testing phase, the model is updated based on new outcomes. The procedure for
algorithm selection chooses a solver and a run duration, trying to optimize a
discounted utility measure in the long run. The authors report that their system is
roughly comparable to SATzilla.

Non-model-based portfolios. This, most recent, approach (Malitsky, Sabharwal, Samu- 
lowitz, & Sellmann, 2011) also relies on k-nearest neighbors, and was independently 
developed in parallel with our research. However, the two systems differ in some 
aspects (which will be shown to be important). This approach uses a standard 
Euclidean distance to measure the distance of neighbors, while each feature has 
to be scaled to the interval [0,1] to avoid dependence on order of magnitude of 
numbers involved. Also, the feature set is somewhat different from the one we use. 
The approach was evaluated on random instances from the SAT Competition 2009 and
gave better results than SATzilla.
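To make the role of scaling concrete, here is a minimal Python sketch of a Euclidean distance over features scaled to [0,1]. It assumes plain min-max scaling with bounds taken from the training set; the function names are illustrative, and the treatment of constant features is an assumption of this example rather than something specified by the approach described above.

```python
import math
from typing import List, Sequence, Tuple


def fit_min_max(vectors: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-feature minimum and maximum over the training vectors."""
    columns = list(zip(*vectors))
    return [min(c) for c in columns], [max(c) for c in columns]


def scale(vector: Sequence[float], lows: Sequence[float], highs: Sequence[float]) -> List[float]:
    """Min-max scale each coordinate; constant features are mapped to 0 (assumption)."""
    return [
        0.0 if hi == lo else (v - lo) / (hi - lo)
        for v, lo, hi in zip(vector, lows, highs)
    ]


def scaled_euclidean(x, y, lows, highs) -> float:
    """Euclidean distance between the scaled versions of x and y."""
    return math.dist(scale(x, lows, highs), scale(y, lows, highs))
```

Without the scaling step, a feature such as the number of clauses (which can be in the millions) would dominate features that are ratios or fractions, which is the dependence on order of magnitude mentioned above.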



3 Nearest Neighbor-Based Algorithm Portfolio 

The existing portfolio systems for SAT build their models (e.g., runtime prediction 
models, grouping of instances, etc.) in advance, regardless of characteristics of the 
input instance. We expected that a finer algorithm selection might be achievable 
if a local, input-specific model is built and used. A simple model of that sort can 
be obtained by the k-nearest neighbor method (Duda, Hart, & Stork, 2001), from
just a few instances similar to the instance being solved. In the rest of this section
we describe our algorithm portfolio for SAT. 

It is assumed that a training set of instances is solved by all solvers from the 
portfolio, and that the solving times (within a given cutoff time) are available. 
Based on these solving times, for each solver a penalty can be calculated for any 
instance (the greater the solving time, the greater the penalty). Each instance is 
represented by a vector of features. 

Our algorithm selection technique is given in Figure 1. Basically, for a new
instance to be solved, its k-nearest neighbors from the training set (with respect to
some distance measure) are found, and the solver with the minimal penalty for those
instances is invoked. In the case of ties among several solvers, the one that
performs the best on the whole training set can be chosen.²

To make the method concrete, the set of features, the penalty and the distance 
measure have to be defined. 

Features. The authors of SATzilla introduced 96 features that are used to
characterize a SAT instance (Xu et al., 2008, 2009), used subsequently also by other
systems (Nikolic et al., 2009; Kadioglu et al., 2010). The main problem with using
the full set of these features is the time needed to compute them for large instances.³
The features we chose, given in Figure 2, can be computed very quickly. They are
some of the purely syntactical ones used by SATzilla. Though this subset may not
be enough for the good runtime prediction that SATzilla is based on, it may serve
well for algorithm selection.



² In practice, it is highly unlikely that there is more than one such solver, but for
completeness we allow for such a possibility (step 5 of the procedure).

³ As said, SATzilla even performs a feature computation time prediction and does the
computation itself only if the predicted time does not exceed 2 minutes.
S: Set of available solvers
T: Set of feature vectors and solving times for each training instance
k: Number of neighbors to be considered (k < |T|)
i: Input instance

(Solver selection)
1. f ← features(i)
2. T' ← set of k instances from T nearest to f
3. S' ← {s ∈ S | penalty(s, T') = min_{s' ∈ S} penalty(s', T')}

(Resolution of ties among solvers from S')
4. S* ← {s ∈ S' | penalty(s, T) = min_{s' ∈ S'} penalty(s', T)}

(Solving)
5. Solve i using any s ∈ S*

Fig. 1 k-nearest neighbors algorithm portfolio for SAT.
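Steps 2 to 4 of the procedure in Figure 1 can be written as a short Python sketch. This is not the ArgoSmArT k-NN source code: the data layout (`Record`), the function names, and the choice to pass the distance in as a parameter are all assumptions made for illustration. Feature extraction (step 1) and the actual solver invocation (step 5) are left out.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# One training record: (feature vector, {solver name: penalty on that instance}).
Record = Tuple[Sequence[float], Dict[str, float]]
Distance = Callable[[Sequence[float], Sequence[float]], float]


def total_penalty(solver: str, records: Sequence[Record]) -> float:
    """Penalty of a solver on a set of instances: the sum of per-instance penalties."""
    return sum(penalties[solver] for _, penalties in records)


def select_solvers(
    f: Sequence[float],
    solvers: Sequence[str],
    training: Sequence[Record],
    k: int,
    distance: Distance,
) -> List[str]:
    # Step 2: the k training instances nearest to the input feature vector f.
    neighbors = sorted(training, key=lambda rec: distance(f, rec[0]))[:k]
    # Step 3: solvers with minimal penalty on the neighbors (the set S').
    local = {s: total_penalty(s, neighbors) for s in solvers}
    best_local = min(local.values())
    candidates = [s for s in solvers if local[s] == best_local]
    # Step 4: break ties by penalty on the whole training set (the set S*).
    overall = {s: total_penalty(s, training) for s in candidates}
    best_overall = min(overall.values())
    return [s for s in candidates if overall[s] == best_overall]
```

Any element of the returned list may then be used to solve the input instance. The sketch sorts the whole training set for clarity; with a few thousand training instances this is cheap, but a partial selection of the k smallest distances would work equally well.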

Problem Size Features: 

1-3. Number of clauses c, Number of variables v, Ratio v/c 

Variable-Clause Graph Features: 

4-8. Variable nodes degree statistics: mean, variation coefficient, min, max, 
and entropy. 
9-13. Clause nodes degree statistics: mean, variation coefficient, min, max, 
and entropy. 

Balance Features: 

14-16. Ratio of positive and negative literals in each clause: mean, variation 
coefficient, and entropy. 

17-21. Ratio of positive and negative occurrences of each variable: mean,
variation coefficient, min, max, and entropy.

22-23. Fraction of binary and ternary clauses. 

Proximity to Horn Formula: 

24. Fraction of Horn clauses 
25-29. Number of occurrences in a Horn clause for each variable: mean, vari- 
ation coefficient, min, max, and entropy. 

Fig. 2 SATzilla features used. 



Penalty. If the solving time of a solver on a given instance is less than a given
cutoff time, the penalty for the solver on that instance is the solving time. If it
is greater than the cutoff time, we take the penalty to be 10 times the cutoff
time. This is the PAR10 score (Hutter, Hoos, Leyton-Brown, & Stützle, 2009). We
define the PAR10 score of a solver on a set of instances to be the sum of its PAR10
scores on individual instances.
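The penalty can be computed as in the following Python sketch. The function names are illustrative, and two conventions are assumptions of this example: an unsolved run is represented by `None`, and a run that takes exactly the cutoff time is treated as unsolved (the definition above does not cover that boundary case).

```python
from typing import Optional, Sequence


def par10(solving_time: Optional[float], cutoff: float) -> float:
    """PAR10 penalty for one run; None means the instance was not solved."""
    if solving_time is not None and solving_time < cutoff:
        return solving_time
    # Timeouts (and, by assumption, runs reaching the cutoff) cost 10x the cutoff.
    return 10.0 * cutoff


def par10_total(times: Sequence[Optional[float]], cutoff: float) -> float:
    """PAR10 score of a solver on a set of instances: the sum over instances."""
    return sum(par10(t, cutoff) for t in times)
```

For instance, with a cutoff of 1500 seconds, a solver that needs 10 seconds on one instance and times out on two others gets a total penalty of 10 + 15000 + 15000 = 30010, so a single timeout outweighs many slow but successful runs.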



Distance measure. We prefer the distance measure that performed well for
ArgoSmArT:

$$d(x, y) = \sum_i \frac{|x_i - y_i|}{\sqrt{|x_i y_i|} + 1}$$

where x_i and y_i are coordinates of vectors x and y (containing feature values of
the instance), respectively. However, any distance measure could be used.
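A distance of this kind is a one-liner in Python. The sketch below assumes the form d(x, y) = Σ_i |x_i − y_i| / (√|x_i y_i| + 1); that exact form is an assumption of this example and should be checked against the ArgoSmArT paper (Nikolic et al., 2009) before being relied on. The function name is illustrative.

```python
import math
from typing import Sequence


def argosmart_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Sum of coordinate differences, each damped by the magnitude of the coordinates.

    Assumed form: sum_i |x_i - y_i| / (sqrt(|x_i * y_i|) + 1).
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return sum(
        abs(a - b) / (math.sqrt(abs(a * b)) + 1.0)
        for a, b in zip(x, y)
    )
```

Under this assumed form, the denominator grows with the magnitude of the two coordinates, so a difference of 3 between the values 1 and 4 contributes 1.0, while the same difference between 0 and 3 contributes 3.0. Large-valued features therefore do not automatically dominate, which is one way to read the remark that no feature scaling is needed.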



Mladen Nikolic et al.



Compared to the approaches described in Section 2, our procedure does not
discriminate between satisfiable and unsatisfiable or between random, crafted, and
industrial instances. The procedure does not use presolvers, does not predict
feature computation time, nor does it use any feature selection or feature
generation mechanisms. It is not assumed that the structure of instance families is
given in advance, nor is it constructed in any way. Also, the algorithm does not use
any advanced statistical techniques, nor does it solve the same instances several
times. Compared to the approach of Malitsky et al. (2011), we use a
smaller feature set and a different distance measure, and we avoid feature scaling.

Note that the special case of the 1-nearest neighbor technique has some advantages
compared to the general case. Apart from a simpler implementation, it can have a
wider range of application. In the case of algorithm configuration selection (which
can be seen as a special case of algorithm selection, since each configuration of an
algorithm can be considered a different algorithm), it would be expensive or
practically impossible to have each instance solved for all algorithm configurations.
Therefore, neither SATzilla nor the k-nearest neighbor approach for k > 1 is
applicable in this situation. However, there are special optimization-based
techniques for finding good solver configurations off-line (Hutter et al., 2009;
Ansotegui et al., 2009). Hence, for each instance, one good configuration can be
known. This is sufficient for the 1-nearest neighbor approach to be used.



4 Implementation and Experimental Evaluation 

Our implementation of the presented algorithm portfolio for SAT, ArgoSmArT
k-NN,⁴ consists of fewer than 2000 lines of C++ code. The core part, concerned
with the solver selection, has around 100 lines of code, while the rest is feature
computation code, solver initialization and invocation, time measurement, etc. All
the auxiliary code (everything except the solver selection mechanism) is derived
from the SATzilla source code by removing or simplifying parts of its code.

In the evaluation we compare ArgoSmArT k-NN with SATzilla 2009. We are
not aware of a publicly available implementation related to the approach of
Malitsky et al., but we compare the differing design decisions of the two
approaches within ArgoSmArT k-NN.

Instead of solving instances from a training set, the training data for SATzilla
2009,⁵ available from the SATzilla web site,⁶ was used. SATzilla was trained using
5883 instances from SAT Competitions (2002-2005 and 2007) and SAT Races 2006
and 2008 (Xu et al., 2009). The data available on the web site include the solving
information for 4701 instances (the solving data for the other instances SATzilla
was trained on is not available on the web). The available solving times of these
instances were used, while the feature values were recomputed (in order to avoid
using the SatELite preprocessor that SATzilla 2009 uses as a first step of feature
computation). The cutoff time of 1500s was used. After excluding instances that were
not solved by any solver within that time limit, as well as duplicate instances,
4276 instances remained in the training set. The feature vectors



⁴ The source code and the binary are available from the download section of
http://argo.matf.bg.ac.rs/.
⁵ SATzilla 2009 is a winner of the SAT Competition 2009 in the random category.
⁶ http://www.cs.ubc.ca/labs/beta/Projects/SATzilla/
of training instances and their solving times for all solvers used are included in
the implementation of ArgoSmArT (1.3Mb of data). As a test set, we used all the
instances from the SAT Competition 2009.

There are 13 solvers for which the solving data are available, and that are used
by SATzilla 2009 as component solvers in 3 versions of SATzilla (random, crafted,
and industrial) (Xu et al., 2009). Each version of SATzilla uses a specialized subset
of these 13 solvers that was found to perform the best on its category of
instances. No specialized versions of ArgoSmArT are made; all 13 solvers
are simply used as component solvers.

ArgoSmArT k-NN can use any distance measure, but we consider two measures.
The first is the one shown in Section 3, which performed best
for ArgoSmArT and for some other problems (Tomovic, Janicic, & Keselj, 2006).
The second one is the Euclidean distance (with features scaled to [0,1]) as
used by Malitsky et al. The comparison of these distances for various k is shown
in Figure 3. The number of solved instances for each distance and each k is
obtained by a leave-one-out procedure (the solver to be used for each instance is
chosen by excluding the instance from the training set and applying the solver
selection procedure to the rest of the training set). Figure 3 shows that the
ArgoSmArT distance is uniformly better than the Euclidean one. For both distances,
the highest number of solved instances is obtained for k = 9. Hence, we use these
choices for ArgoSmArT k-NN in further evaluation. Also, we justify our choice of
features by measuring the feature computation times. For the full set of 48 features
used by Malitsky et al., the minimal, average, and maximal computation times on the
training set are 0.002, 19.2, and 6257 seconds. For our reduced set of features, the
minimal, average, and maximal computation times are 0.0017, 0.45, and 51.2 seconds.
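The leave-one-out count described above can be sketched in Python as follows. The names and data layout are assumptions for illustration, not the actual evaluation scripts: each record holds a feature vector and a PAR10-style penalty per solver, an instance counts as solved when the chosen solver's penalty is below the cutoff, and any ties that remain after the global tie-break are resolved by taking the first solver in the list.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# (feature vector, {solver name: PAR10-style penalty on that instance})
Record = Tuple[Sequence[float], Dict[str, float]]
Distance = Callable[[Sequence[float], Sequence[float]], float]


def pick_solver(f, solvers: Sequence[str], training: Sequence[Record],
                k: int, distance: Distance) -> str:
    """k-NN selection: minimal penalty on the k neighbors, ties by overall penalty."""
    neighbors = sorted(training, key=lambda rec: distance(f, rec[0]))[:k]

    def key(solver: str) -> Tuple[float, float]:
        local = sum(p[solver] for _, p in neighbors)
        overall = sum(p[solver] for _, p in training)
        return (local, overall)

    return min(solvers, key=key)


def leave_one_out_solved(training: List[Record], solvers: Sequence[str],
                         k: int, distance: Distance, cutoff: float) -> int:
    """Count instances solved when each one is held out of the training set in turn."""
    solved = 0
    for idx, (features, penalties) in enumerate(training):
        rest = training[:idx] + training[idx + 1:]  # exclude the instance itself
        chosen = pick_solver(features, solvers, rest, k, distance)
        if penalties[chosen] < cutoff:  # penalty below cutoff means it was solved
            solved += 1
    return solved
```

Running `leave_one_out_solved` for each k in a range and for each candidate distance gives curves of the kind reported in Figure 3. On a toy set with two well-separated groups of instances, a small k picks the locally best solver, while a k large enough to include the other group can flip the choice, which is the locality effect the evaluation is meant to expose.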

An experimental comparison between ArgoSmArT k-NN and any individual
version of SATzilla would not be fair, since each version is designed specifically
for one kind of instances. So, in order to make a fair comparison, on random
instances we used SATzilla random, on crafted instances we used SATzilla crafted,
and on industrial instances we used SATzilla industrial. This virtual SATzilla
system will simply be referred to as SATzilla. In our experimental comparison, we
included MXC08 (the best single solver on the training set), SATzilla, the ArgoSmArT
system based on (Nikolic et al., 2009) with the 13 SATzilla solvers instead of
ArgoSAT configurations, ArgoSmArT 1-NN, and ArgoSmArT 9-NN. Also, we compare to
the virtual best solver (VBS), a virtual solver that solves each instance by the
fastest available solver for that instance (showing the upper achievable limit).
Experiments were performed on a cluster with 32 dual core 2GHz Intel Xeon processors
with 2GB RAM per processor. The results are given in Table 1 and they show that
ArgoSmArT 1-NN/ArgoSmArT 9-NN outperformed all other solvers/portfolios in
all categories.
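The virtual best solver is easy to compute from a table of per-solver solving times. The Python sketch below is only illustrative: the function name and the representation of unsolved runs as `None` are assumptions, and the example times are made up to show the behavior, not taken from the experiments.

```python
from typing import Dict, List, Optional


def virtual_best_times(
    times_by_solver: Dict[str, List[Optional[float]]]
) -> List[Optional[float]]:
    """Per-instance time of the virtual best solver.

    times_by_solver maps each solver to its solving times, one entry per
    instance in the same order; None marks an unsolved instance.
    """
    result: List[Optional[float]] = []
    for runs in zip(*times_by_solver.values()):
        solved = [t for t in runs if t is not None]
        # The VBS solves an instance iff at least one solver does,
        # and does so in the time of the fastest such solver.
        result.append(min(solved) if solved else None)
    return result
```

With `{"A": [1.0, None, None], "B": [3.0, 2.0, None]}` the virtual best solver takes 1.0 and 2.0 seconds on the first two instances and leaves the third unsolved. No selection procedure over the same component solvers can solve more instances than this, which is why it serves as the upper limit in the comparison.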

It is a common practice at SAT Competitions and SAT Races to reuse instances
known from previous events. This results in an overlap between the training and
test sets. To be thorough, in Table 2 we provide an experimental evaluation on the
test set without the instances contained in the training set.

One can observe that on one subset of instances (industrial instances that did
not appear in earlier competitions), the MXC08 component solver performs better
than all the portfolio approaches. This probably means that the test set is
somewhat biased with respect to the training set. However, the presented results
show that ArgoSmArT k-NN significantly outperformed other entrants on this test set






[Figure 3: plot with two curves, labeled "ArgoSmArT distance" and "Normalized Euclidean distance"]

Fig. 3 The number of solved instances from the training set for ArgoSmArT using each of
the compared distances for each k from 1 to 30.



           MXC08    SATzilla  ArgoSmArT  ArgoSmArT 1-NN  ArgoSmArT 9-NN  VBS
Time_ALL   > 1500s  635s      874s       390s            353s            115s
N_ALL      355      635       609        685             692             816
N_RND      84       375       308        367             390             454
N_CRF      124      128       154        158             149             188
N_IND      147      132       147        160             153             174


Table 1 Experimental results on instances from SAT Competition 2009. For each 
solver/portfolio the number of solved instances and the median solving time are given for 
the whole corpus. Also, the number of solved instances is given for each of the categories of 
instances — random, crafted, and industrial. The total number of instances is 1143. 
Table 2 Experimental results on instances from SAT Competition 2009 without the instances
known from previous SAT Competitions and SAT Races. For each solver/portfolio the number 
of solved instances and the median solving time are given for the whole corpus. Also, the 
number of solved instances is given for each of the categories of instances — random, crafted, 
and industrial. The total number of instances is 894. 
as well. Possible reasons for this involve two characteristics of ArgoSmArT k-NN.
First, in contrast to the original ArgoSmArT and SATzilla, ArgoSmArT k-NN does
not use predefined groups or precomputed prediction models, built regardless of
the input instance. Instead, ArgoSmArT k-NN selects a solver by considering only
a local set of instances similar to the input instance. Second, ArgoSmArT and
SATzilla make their choices by considering specific groups of instances, but these
groups are typically large. On the other hand, ArgoSmArT k-NN considers only
a very small number of instances (e.g., k = 9) and this eliminates the influence of
less relevant instances. Indeed, SATzilla improves its predictive performance by
building specific versions for smaller sets of related instances (i.e., random,
crafted, industrial) (Xu et al., 2008), which also supports the above speculation.



5 Conclusions 

We presented a strikingly simple algorithm portfolio for SAT stemming from our
work on ArgoSmArT (Nikolic et al., 2009). The presented system, ArgoSmArT
k-NN, benefits from the SATzilla system in several ways: it uses a subset of SATzilla
features for representation of instances, a selection of SAT solvers, SATzilla solving
data for the training corpus, and fragments of the SATzilla implementation. However,
in its core part, the selection of a solver to be used, ArgoSmArT k-NN significantly
differs from SATzilla. Instead of predicting runtimes, our system selects a solver
based on the knowledge about instances from a local neighborhood of the input
instance, using the k-nearest neighbors method. The proposed system is
implemented and publicly available as open source. The experimental evaluation shows
that the particular decisions made in the design of our system are even better than
the decisions made in the similar recent system (Malitsky et al., 2011). Also, it
(even the simplest version, ArgoSmArT 1-NN) performed substantially better than
SATzilla, the most successful, but rather complex, algorithm portfolio for SAT.
We believe that the presented approach shows that there is room for improving
algorithm portfolio systems for SAT, not necessarily by overengineering.